Background of the Invention

[0001] The present invention relates to multiplier-accumulator
("MAC") blocks and, more particularly, to a more efficient way to
make use of multipliers in a MAC block.

[0002] A MAC block, sometimes referred to as a digital signal
processing ("DSP") block, is DSP circuitry that implements a group of
multipliers and other components such as arithmetic components. MAC
blocks may be used in the processing of many different types of
applications, including graphics applications, networking
applications, communications applications, and video applications.
Because of the versatility of MAC blocks, and of multipliers in
general, manufacturers of programmable logic devices, such as
Altera® Corporation of San Jose, California, have recently begun
manufacturing programmable logic devices that, in addition to
programmable logic circuitry, also contain hardware DSP circuitry in
the form of MAC blocks. The MAC blocks of programmable logic devices
provide a way in which certain functionality of a user's design may
be implemented using less space on the programmable logic device and
with a faster execution time because of the nature of DSP circuitry
relative to programmable logic circuitry.

[0003] MAC blocks are made of a number of multipliers and adders.
Whenever one or more of the multipliers in a particular MAC block
need to be used, the entire MAC block is placed into a mode of
operation based on how many of the multipliers are to be used for the
particular implementation. For example, if the MAC block contains a
total of four 18 bit by 18 bit multipliers, and if a particular
design requires the use of a single 18 bit by 18 bit multiplier, then
the MAC block is put into a mode of operation such that each of the
18 bit by 18 bit multipliers can only be used individually in an 18
bit by 18 bit multiply mode. Therefore, the remaining three
multipliers are limited for use only in 18 bit by 18 bit multiply
modes. This results in an inefficient limitation on the potential use
of the remaining multipliers in the MAC block.

[0004] It would therefore be desirable to implement a MAC block such
that the multipliers in the MAC block may be used in different modes
of operation simultaneously.

Summary of the Invention

[0005] It is therefore an object of the present invention to provide
a MAC block in which mode splitting among the multipliers in the MAC
block may be enabled.

[0006] This and other objects of the present invention are
accomplished by providing a programmable logic device having one or
more MAC blocks in which different modes may be implemented
simultaneously. The multipliers and other DSP circuitry (e.g.,
arithmetic circuitry such as adders) that make up a MAC block may be
allocated among different modes of operation at any particular point
in time. For example, in a preferred arrangement of a MAC block
having four 18 bit by 18 bit multipliers, one 18 bit by 18 bit
multiplier may be used to implement an 18 bit by 18 bit multiply
mode, while two other multipliers may be used to implement the sum of
two 18 bit by 18 bit multiplications mode. Any such suitable modes
may be implemented simultaneously based on available resources.

[0007] Any suitable control signals and control circuitry may be used
to control which modes are to be implemented in the MAC block.
Control signals may, for example, indicate whether the output of a
particular multiplier is to be input into an adder/subtracter based
on whether the mode being implemented requires such circuitry.
Because any suitable modes may be implemented in accordance with the
present invention, it will be understood that any suitable control
signals and control circuitry may be used. It will further be
understood that different control signals and different control
circuitry may be used to implement the same modes.

Brief Description of the Drawings

[0008] The above and other objects of the present invention will be
apparent upon consideration of the following detailed description,
taken in conjunction with the accompanying drawings, in which like
reference characters refer to like parts throughout, and in which:

[0009] FIG. 1 is a schematic representation of an illustrative MAC
block;

[0010] FIG. 2 is a block diagram of an illustrative MAC block in
which four n bit by n bit multipliers are implemented as four n bit
by n bit multipliers;

[0011] FIG. 3 is a block diagram of an illustrative MAC block in
which four n bit by n bit multipliers are implemented as eight n/2
bit by n/2 bit multipliers;

[0012] FIG. 4 is a schematic diagram of an illustrative 18 bit by 18
bit multiply mode implementation in accordance with the present
invention;

[0013] FIG. 5 is a schematic diagram of an illustrative 52 bit
accumulate mode implementation in accordance with the present
invention;

[0014] FIG. 6 is a schematic diagram of an illustrative sum of two 18
bit by 18 bit multiplications mode implementation in accordance with
the present invention;

[0015] FIG. 7 is a schematic diagram of an illustrative sum of four
18 bit by 18 bit multiplications mode implementation in accordance
with the present invention;

[0016] FIG. 8 is a schematic diagram of an illustrative 9 bit by 9
bit multiply mode implementation in accordance with the present
invention;

[0017] FIG. 9 is a schematic diagram of an illustrative sum of two 9
bit by 9 bit multiplications mode implementation in accordance with
the present invention;

[0018] FIG. 10 is a schematic diagram of an illustrative sum of four
9 bit by 9 bit multiplications mode implementation in accordance with
the present invention;

[0019] FIG. 11 is a schematic diagram of an illustrative 36 bit by 36
bit multiply mode implementation in accordance with the present
invention;

[0020] FIG. 12 is a block diagram of a MAC block having illustrative
control signals in accordance with the present invention;

[0021] FIG. 13 is a block diagram of an illustrative programmable
logic device having at least one MAC block in accordance with the
present invention; and

[0022] FIG. 14 is a block diagram of an illustrative system employing
a programmable logic device in accordance with the present invention.

Detailed Description of the Invention

[0023] The present invention provides a MAC block that allows its
multipliers, other circuitry, or both to be split among one or more
modes of operation simultaneously. One or more multipliers of the MAC
block may be designated to operate in one mode (e.g., a multiply
mode) whereas one or more other multipliers of the MAC block may be
designated to operate in another mode (e.g., sum of multipliers
mode). The present invention allows a single MAC block to support
different modes that require different numbers of multipliers. For
example, two multipliers may be used in one mode, whereas only one
multiplier may be used in a second mode.

[0024] The present invention is particularly applicable to
programmable logic devices that include integrated DSP circuitry.
Because of the need for flexibility from such devices, allowing a MAC
block to operate in more than one mode simultaneously allows for more
efficient use of the DSP resources available within a particular
programmable logic device.

[0025] Allowing a MAC block to operate in more than one mode may be
accomplished by using any suitable circuitry and any suitable control
signals.

[0026] A MAC block according to the present invention may operate in
any suitable modes. Consider, for example, a MAC block having four 18
bit by 18 bit multipliers, meaning that each can determine, as a
36-bit binary output, either the product of two 18-bit binary
multiplicand inputs or the two products (concatenated into one 36-bit
number) of two pairs of 9-bit binary multiplicand inputs
(concatenated into one pair of 18-bit numbers). Suitable modes for
such a MAC block include, but are not limited to, an 18 bit by 18 bit
multiplier, a 52 bit accumulator, an accumulator initialization, a
sum of two 18 bit by 18 bit multipliers, a sum of four 18 bit by 18
bit multipliers, a 9 bit by 9 bit multiplier, a sum of two 9 bit by 9
bit multipliers, a sum of four 9 bit by 9 bit multipliers, a 36 bit
by 36 bit multiplier, or any other suitable modes. The listed modes
are sometimes referred to herein as modes 1-8, respectively, with the
accumulator initialization being mode 2A. It will be understood that
these are merely illustrative modes that may be supported by a MAC
block in accordance with the present invention. Any other suitable
modes may be supported. Such support of modes may be determined based
on any suitable factors, including, for example, application needs,
size of available multipliers, number of multipliers, or any other
suitable factors. For example, if a MAC block included eight 9 bit by
9 bit multipliers, different modes may be used (e.g., sum of eight 9
bit by 9 bit multipliers).
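
By way of illustration only, the two behaviors of a single 18 bit by
18 bit multiplier described above can be modeled in software. The
sketch below assumes unsigned operands and assumes that the higher
order 9-bit halves are multiplied together and the lower order 9-bit
halves are multiplied together; the function names `mul18` and
`dual_mul9` are hypothetical and do not appear in the specification.

```python
MASK9 = (1 << 9) - 1    # nine low-order bits
MASK18 = (1 << 18) - 1  # eighteen low-order bits


def mul18(a: int, b: int) -> int:
    """Unsigned 18 bit by 18 bit multiply; the product fits in 36 bits."""
    if not (0 <= a <= MASK18 and 0 <= b <= MASK18):
        raise ValueError("operands must be unsigned 18-bit values")
    return a * b


def dual_mul9(a_pair: int, b_pair: int) -> int:
    """Two unsigned 9 bit by 9 bit multiplies on packed 18-bit inputs.

    Each 18-bit input carries two 9-bit operands. The two 18-bit
    products are concatenated into one 36-bit result (assumed layout:
    high-half product in the upper 18 bits).
    """
    if not (0 <= a_pair <= MASK18 and 0 <= b_pair <= MASK18):
        raise ValueError("operands must be unsigned 18-bit values")
    a_hi, a_lo = a_pair >> 9, a_pair & MASK9
    b_hi, b_lo = b_pair >> 9, b_pair & MASK9
    # A 9x9 product is at most 511 * 511, which fits in 18 bits,
    # so the two products never overlap when concatenated.
    return ((a_hi * b_hi) << 18) | (a_lo * b_lo)
```

In this model the same 36-bit output bus carries either one full
product or two independent half-width products, which is why a block
of four such multipliers can present itself as eight 9 bit by 9 bit
multipliers.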

[0027] Different multipliers of a MAC block may be used in different
modes simultaneously to avoid the situation where a particular mode
makes use of relatively few multipliers of a MAC block, leaving the
other multipliers idle.

[0028] In some embodiments of the present invention, a MAC block may
be split into two or more sections of multipliers. Modes may be
designated according to section, whereby all the multipliers in a
section of multipliers operate in the same mode. This arrangement may
provide a simpler organization of control signals and may provide a
balance between flexibility and simplicity. Sections may be defined
based on modes that are desired to be used. For example, if all
multipliers of a MAC block are to be used in a particular mode, then
splitting will not occur and the MAC block will operate under a
single mode. If half the multipliers are needed for a particular
mode, then the MAC block may be split such that there are two
sections, each having half of the multipliers. Each of the two
sections may then be operated under a different mode if desired. In
one suitable approach, a section may be further split. For example, a
MAC block may be split among three modes where one of the modes uses
half of the multipliers, a second mode uses a quarter of the
multipliers, and a third mode uses a quarter of the multipliers. A
MAC block may be split among four modes where each mode uses one
quarter of the available multipliers. Any such suitable mode
splitting may be done in accordance with the present invention.
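
The half and quarter splits described above can be sketched as a
simple allocation routine. This is only an illustration under stated
assumptions: the mode names, the `MULTIPLIERS_NEEDED` table, and the
`split_modes` function are hypothetical, and placing larger modes
first is one possible policy for keeping a two-multiplier mode within
a single half of the block, not a policy stated in the specification.

```python
# Hypothetical mode names mapped to the number of 18x18 multipliers
# each mode consumes in a four-multiplier MAC block.
MULTIPLIERS_NEEDED = {
    "mult_18x18": 1,      # one quarter of the block
    "sum_of_2_18x18": 2,  # one half of the block
    "sum_of_4_18x18": 4,  # the whole block, no splitting
}


def split_modes(requested, total=4):
    """Assign multiplier indices to each requested mode.

    Larger modes are placed first so that a two-multiplier mode lands
    on indices (0, 1) or (2, 3), i.e. within one half of the block.
    Raises ValueError if the requested modes need more multipliers
    than the block contains.
    """
    ordered = sorted(requested, key=lambda m: -MULTIPLIERS_NEEDED[m])
    assignment = []
    next_free = 0
    for mode in ordered:
        need = MULTIPLIERS_NEEDED[mode]
        if next_free + need > total:
            raise ValueError(f"not enough multipliers for mode {mode!r}")
        assignment.append((mode, list(range(next_free, next_free + need))))
        next_free += need
    return assignment
```

For instance, requesting one multiply mode together with one sum of
two multiplications mode places the sum on multipliers 0 and 1 and
the single multiply on multiplier 2, leaving multiplier 3 free for a
further mode.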

[0029] For purposes of brevity and clarity, and not by way of
limitation, and without loss of generality, the present invention is
primarily described herein in terms of a MAC block made of four
multipliers of 18 bit by 18 bit size. The illustrative nature of this
arrangement will be appreciated and it will be understood that the
teachings of the present invention may be applied to any other
suitable type of MAC block having any suitable arrangement of
component circuitries.

[0030] FIG. 1 shows the circuitry of an illustrative embodiment of a
MAC block 10 of the type described above. MAC block 10, having inputs
101-108, includes four multipliers 11, 12, 13 and 14. Each of
multipliers 11-14 may be an 18 bit by 18 bit multiplier. Each MAC
block 10 preferably also includes a number of adder/subtracters 15,
16, 17 and 18 allowing the performance of addition and subtraction of
the outputs of the various multipliers 11-14, as well as an
accumulator function.

[0031] Multiplexers 119 allow the various multipliers 11-14 to share
one input 101. Similarly, multiplexers 109 and registers 110 allow
each of multiplier inputs 111, 112, 121, 122, 131, 132, 141, 142 to
be registered or unregistered. In addition, registers 110, when used
with multiplexers 190, can form input shift register chains that
allow data to be entered serially. Such input shift register chains
can even extend to other different specialized multiplier. The
various functions may be output at 160, 165, 170, 175, 180, 185, 195.

[0032] According to the invention, if a user design includes
multiplication and other arithmetic circuit elements, those elements,
which may be referred to as "MAC elements", preferably are
automatically grouped into a MAC block such as MAC block 10. MAC
elements grouped together may perform, within the MAC block, the
specialized functions of multiplication, multiplication followed by
addition, multiplication followed by subtraction, and multiplication
followed by accumulation.

[0033] In FIG. 2, a vertically-arranged four multiplier-based
organization of a MAC block is shown. Four multiplier circuits 136
may be stacked vertically to potentially operate in parallel. Each
multiplier circuit 136 may include an n bits by n bits multiplier
(e.g., 18 bit by 18 bit multiplier) to provide an n bits by n bits
multiplication product. The inputs of each multiplier circuit may be
fed up to n bits of information for the multiplicand and for the
multiplier for the multiplier operation. Each multiplier circuit 136
may have an output that may be 2n-bits wide. Each multiplier circuit
136 may feed an output downstream that is the result of a
multiplication operation. Each n bits by n bits multiplier circuit
136 may support two's complement signed or unsigned multiplication.
Dynamic signed/unsigned control inputs 156 may receive input signals
that control the sign of the multipliers and the multiplicands for
the multiplier operations of multiplier circuits 136.

[0034] MAC block 192 may include three sets of register circuits:
input register circuits 134, pipeline register circuits, and output
register circuit 154. If desired, additional pipeline register
circuits may be included inside multiplier circuits 136, inside
add-subtract-accumulate circuits 144, and/or inside add-subtract
circuits 140 to increase speed. Output register circuit 154 may
include approximately the same number of registers that are in input
register circuits 134. The number of registers that are included in
output register circuit 154 may be sufficient to register the output
of MAC block 192 (e.g., register the output of MAC block 192 for all
of the modes that are supported by MAC block 192). The number of
output registers may be less than, equal to, or greater than the
number of the input registers depending on what implementation or
architecture is being used for MAC block 192 or depending on the
range of functionality that is being provided by MAC block 192.

[0035] For clarity and brevity, pipeline register circuits are not
shown in FIG. 2 and are not shown in some of the other FIGS.
described herein. As mentioned above, input register circuits 134,
pipeline register circuit, or output register circuit 154 may be
included in MAC block 192 if desired. Independent sets of clock and
clear signals 158 may be provided for input register circuits 134,
the pipeline register circuit, or output register circuit 154. Two
sets of clock and clear signals 158 may be provided for the input
register circuits 134 and the pipeline register circuits, and two
sets may be provided for output register circuit 154. Input register
circuits 134 may include scan chains and may include additional
circuitry to be used with the scan chains to allow the scan chains to
be used as logic in some digital signal processing functions such as
in providing FIR filters. Input register circuits 134 may include 8n
registers (e.g., 144 registers) for 8n data inputs and q registers
(e.g., 4 registers) for signed/unsigned control of multiplier
circuits 136 and for add-subtract control of add-subtract-accumulate
circuits 144. Each register may have programmable inversion
capability to provide logic inversion, when desired, or to invert
unused bits of register inputs when an input for a multiplier has
fewer than n bits.
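
As a purely illustrative software analogy of the FIR filter use
mentioned above, the sketch below models a serial input register
chain feeding four multipliers whose products are summed. The
four-tap length, the function name `fir4`, and the use of a deque to
stand in for the register chain are assumptions made for the example
and are not taken from the specification.

```python
from collections import deque


def fir4(samples, coeffs):
    """Four-tap FIR filter built on a serial shift chain.

    Each new sample is shifted into the chain, the oldest sample
    falls off the end, and the four chain entries are multiplied by
    fixed coefficients and summed (one multiplier per tap).
    """
    if len(coeffs) != 4:
        raise ValueError("this sketch assumes exactly four taps")
    chain = deque([0, 0, 0, 0], maxlen=4)  # models the register chain
    outputs = []
    for x in samples:
        chain.appendleft(x)  # shift the new sample in serially
        outputs.append(sum(c * d for c, d in zip(coeffs, chain)))
    return outputs
```

Feeding a unit impulse through this model returns the coefficients in
order, which is the expected impulse response of an FIR filter.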

[0036] Output register circuit 154 may have feedback paths 161 to
add-subtract-accumulate circuits 144 for accumulation operations. Any
one of the three sets of registers, input register circuit 134, the
pipeline register circuit, and output register circuit 154 may be
bypassed using programmable logic connectors ("PLCs") in those
circuits that may be controlled by random access memory control. The
pipeline register circuit may include approximately the same number
of registers as input register circuits 134.

[0037] Interface circuitry 133 shown to the left of MAC block 192 may
feed the inputs of MAC block 192, which may be the inputs of input
register circuits 134. Input register circuits 134 may include eight
input registers that each have n bit inputs and that feed the inputs
of the four n bit by n bits multiplier circuits 136.

[0038] Add-subtract-accumulate circuits 144 may have connections for
receiving inputs from multiplier circuits 136 and from return paths
161. If desired, add-subtract-accumulate circuits 144 may be
configured to pass the outputs from multiplier circuits 136 to adder
circuit 140. The outputs of multiplier circuits 136 may be routed to
output selection circuit 152 or output register circuit 154 without
being routed through add-subtract-accumulate circuits 144 and/or
add-subtract circuit 140. For the purposes of clarity and brevity and
not by way of limitation and without loss of generality, add-subtract
circuit 140 is described herein primarily in the context of an adder
circuit. Add-subtract-accumulate circuits 144 may each be configured
to perform a two's complement addition of two 2n bit inputs to
produce a 2n + 1 bit output. Add-subtract-accumulate circuits 144 may
each be configured to perform a two's complement subtraction of two
2n bit inputs to produce a 2n + 1 bit output.
Add-subtract-accumulate circuits 144 may each be configured to
perform an accumulation of one 2n bit input with an n+y bit output.
Dynamic add/subtract control inputs 162 and 164 may be inputs to
add-subtract-accumulate circuits 144 that are used to switch between
addition and subtraction operations. Such dynamic control may be
needed for complex multiplications, that is, multiplications
involving complex numbers, because the multiplication of two complex
numbers may sometimes involve both an addition operation and a
subtraction operation.
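
The reason a complex multiplication needs both operations can be seen
from the standard identity (a + bi)(c + di) = (ac - bd) + (ad + bc)i.
The following sketch, given only as an illustration with a
hypothetical function name, computes the result from four real
multiplications, one subtraction, and one addition, which corresponds
to two sum-of-two-products operations with different add/subtract
settings.

```python
def complex_multiply(a_re, a_im, b_re, b_im):
    """Multiply (a_re + a_im*i) by (b_re + b_im*i) using real arithmetic.

    Four real products are formed. The real part subtracts one pair
    of products and the imaginary part adds the other pair.
    """
    real = a_re * b_re - a_im * b_im  # adder/subtracter set to subtract
    imag = a_re * b_im + a_im * b_re  # adder/subtracter set to add
    return real, imag
```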

[0039] The outputs of add-subtract-accumulate circuits 144 may be
routed to output selection circuit 152 or output register 154 without
being routed through adder circuit 140. If desired, adder circuit 140
may be configured to pass inputs from add-subtract-accumulate
circuits 144 (e.g., n+1 bit output of two's complement addition, n+y
bit output of accumulation, etc.). Adder circuit 140 may have an
output that is the resultant of the addition of the outputs from
add-subtract-accumulate circuits 144. Output selection circuit 152
may have inputs that are from adder circuit 140. Output selection
circuit 152 may select which ones of the inputs of output selection
circuit 152 are passed to output register circuit 154. Output
register circuit 154 may feed the inputs of interface circuitry 133
shown to the right of MAC block 192. The percent of local
interconnect resources that is allocated for connecting the circuits
in MAC block 192 increases as the complexity and the variations in
digital signal processing functionality increase from left to right
in MAC block 192.

[0040] With reference to FIG. 2, the "top half" of MAC block 192 may
include, among other components, the two multipliers 136 and
adder/subtracter 144 shown at the top of MAC block 192. The "bottom
half" of MAC block 192 may include, among other components, the two
multipliers 136 and adder/subtracter shown at the bottom of MAC block
192.

[0041] MAC block 192 may be configured to have an n/2 bits by n/2
bits multiplier based organization. For example, with reference now
to FIG. 3, MAC block 192 may include multiplier circuits 136 that are
configured to include eight n/2 bits by n/2 bits multipliers. The
eight n/2 bits by n/2 bits multipliers may be configured from the
four n bits by n bits multipliers of multiplier circuits 136 of
FIG. 2.

[0042] If desired, MAC block 192 may be implemented to be able to be
configured to have a p bits by p bits multiplier based organization
and to have one or more p/m bits by p/m bits multiplier based
organizations where p, m, and p/m are integers. As mentioned above,
this architecture is at least partially based on the limitations of
the local interconnect resources. The different organizations may be
selectable and MAC block 192 may be capable of being configured into
some or all of the p/m bits by p/m bits multiplier based
organizations.

[0043] MAC block 192 may include add-subtract-accumulate circuits 144
configured to provide four add or subtract units. Each add or
subtract unit may perform an addition-based operation on two n bit
inputs and have an n + 1 bit output. If desired,
add-subtract-accumulate circuits 144 may be configured to pass the
outputs of the n/2 bits by n/2 bits multiplier operation. The outputs
of multiplier circuits 136 may be routed to output selection circuit
152 or output register circuit 154 without being routed through
add-subtract-accumulate circuits 144 or adder circuit 140.
Add-subtract-accumulate circuits 144 may produce the resultant of the
addition (or subtraction) of particular output pairs of the n/2 bits
by n/2 bits multiplier operation.

[0044] MAC block 192 may include adder circuit 140 configured to
provide two adders. If desired, adder circuit 140 may pass the inputs
that are fed to adder circuit 140 from add-subtract-accumulate
circuits 144. The outputs of add-subtract-accumulate circuits 144 may
be routed to output selection circuit 152 or output register circuits
154 without being routed through adder circuit 140. Adder circuit 140
may produce two outputs that are the resultants of the addition of
particular pairs of outputs from add-subtract-accumulate circuits
144.

[0045] The local interconnect resources of MAC block 192 may be
configurable to implement the n/2 bits by n/2 bits multiplier based
organization with the same input/output interface circuitry 133 and
supporting circuitry (e.g., multiplier circuits 136, adder circuit
140, etc.) as the n bits by n bits multiplier based organization. The
local interconnect resources of MAC block 192 may be configured to
include some butterfly cross connection patterns for forming
appropriate interconnections in the n/2 bits by n/2 bits multiplier
based organization.

[0046] The butterfly cross connection patterns are implemented for
select interconnections between input register circuits 134 and
multiplier circuits 136. The butterfly cross connection patterns may
be used to have the n/2 higher order bits of pairs of n bit inputs
multiplied together and to have the n/2 lower order bits of pairs of
n bit inputs multiplied together. The butterfly cross connection
patterns are implemented for select interconnections between
multiplier circuits 136 and add-subtract-accumulate circuits 144. As
mentioned above, add-subtract-accumulate circuits 144 may be
configured to include four add (or subtract) units. Each add (or
subtract) unit may have two n bit inputs from multiplier circuits
136. The butterfly cross connection patterns may be used to have the
two inputs of each add (or subtract) unit be either the resultant of
the multiplication of the higher order bits by the multipliers of
multiplier circuits 136 or the resultant of the multiplication of the
lower order bits by the multipliers of multiplier circuits 136. The
butterfly cross connection patterns may also be used in the
interconnect between add-subtract-accumulate circuits 144 and adder
circuit 140. Adder circuit 140 may be split into two adders (e.g.,
two independent adders). The butterfly cross connection pattern may
be used to feed the resultant of operations on higher order bits to a
top half of adder circuit 140 and to feed the resultant of operations
on lower order bits to a bottom half of adder circuit 140. In the n/2
bits by n/2 bits multiplier based organization, accumulator
functionality may not be available. Accumulator functionality may not
be available because the resources of MAC block 192 may be
substantially consumed in allowing for the implementation of the n/2
bits by n/2 bits multiplier based organization.

[0047] The butterfly cross connection patterns are exemplary of
techniques for decomposing a single multiplier circuit into multiple
smaller multiplier circuits, exemplary of techniques for managing
data so that the outputs of the multiple smaller multiplier circuits
are appropriately added together (e.g., adding lower order bits to
lower order bits), or exemplary of techniques for managing data to
compensate for limitations in the resources of a MAC block. Such
cross connect patterns may be used to handle connections because of
the way that circuitry for a MAC block was laid down or because of
the arrangement that was selected for the circuitry. The butterfly
cross connection patterns are provided as an illustrative example.
Other techniques may also be used. For example, the n bits by n bits
multipliers may be decomposed in a different way that eliminates the
need for the butterfly cross connection patterns or decomposed in a
way that may require different types of cross connect patterns.
Accordingly, other cross connection or connection patterns may be
used to implement MAC block 192.

[0048] The flexibility and configurability of MAC block 192 may
support the configuration of a set of modes of operation. If desired,
MAC block 192 of FIG. 2 and MAC block 192 of FIG. 3 may each be a
separate embodiment of a MAC block with each having its own set of
modes of operation. In some embodiments, MAC block 192 may be
configurable between having an n bits by n bits multiplier based
organization or an n/2 bits by n/2 bits multiplier based organization
and having modes of operation that are associated with each. The
modes of MAC block 192 may be configured with memory bits to make the
modes available to users.

[0049] FIGS. 4-11 are block diagrams of illustrative implementations
of different modes of operation that a MAC block according to the
present invention may support. More particularly, the mode
implementations of FIGS. 4-11 illustrate the components of the host
MAC block that may be required to implement each respective mode. For
example, if a particular mode implementation requires a single 18 bit
by 18 bit multiplier, then the remaining multipliers may be used to
implement other modes in accordance with the mode splitting features
of the present invention.

[0050] FIG. 4 is a block diagram of an illustrative implementation of
an 18 bit by 18 bit multiply mode in a MAC block. As illustrated, a
single 18 bit by 18 bit multiply implementation makes use of one 18
bit by 18 bit multiplier 404 having multiplicand and multiplier
inputs 400 and 402 and a product output 406. In a typical MAC block,
the illustrated implementation would permit four such multiply modes
to be implemented in a single MAC block simultaneously, each using
one of the four available multipliers. In accordance with the mode
splitting features of the present invention, the remaining three
multipliers may be used to implement any other suitable mode
simultaneously with the 18 bit by 18 bit multiply mode that
multiplier 404 is being used to implement.

[0051] FIG. 5 is a block diagram of an illustrative implementation of
a 52 bit accumulate mode in a MAC block. As illustrated, a single 52
bit accumulate mode implementation makes use of one 18 bit by 18 bit
multiplier 504 having inputs 500 and 502 and an output 506.
Adder/subtracter 508 is used to perform addition operations to update
the running sum stored in register 510. Output 512 of register 510 is
fed back into adder/subtracter 508 to be added with a next output 506
of multiplier 504. In a typical MAC block, the illustrated
implementation would permit two such accumulate modes to be
implemented in a single MAC block simultaneously, each using one of
the four available multipliers (i.e., wasting two of the multipliers)
when using a MAC block such as MAC block 192 of FIG. 2. This is
because of the limited dedicated arithmetic circuitry available in
MAC block 192 of FIG. 2. In accordance with the mode splitting
features of the present invention, however, the remaining multipliers
may be used in implementing other modes simultaneously with the
accumulating mode that do not rely on the dedicated circuitry already
being used (e.g., an 18 bit by 18 bit multiply mode). This allows
more efficient use of MAC block resources.
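
The feedback loop of the accumulate mode may be sketched in software
as follows. The model assumes unsigned operands and assumes that the
running sum wraps modulo 2^52 on overflow; neither assumption is
stated in the specification, and the class name `Accumulator52` is
hypothetical.

```python
ACC_BITS = 52
ACC_MASK = (1 << ACC_BITS) - 1


class Accumulator52:
    """Multiply-accumulate loop with a 52-bit running-sum register."""

    def __init__(self):
        self.register = 0  # models the register holding the running sum

    def step(self, a: int, b: int) -> int:
        """Multiply two 18-bit operands and add the product to the sum."""
        product = a * b  # 36-bit product from the multiplier
        # The register output is fed back and added to the new product.
        self.register = (self.register + product) & ACC_MASK
        return self.register
```

Because a 36-bit product is accumulated into a 52-bit register, there
are 16 bits of headroom, so in this unsigned model at least 2^16
products can be summed before the register can wrap.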

[0052] FIG. 6 is a block diagram of an illustrative implementation of
the sum of the products of two 18 bit by 18 bit multipliers mode in a
MAC block (e.g., the two multipliers of either the top half or of the
bottom half). As illustrated, a single sum of the products of two
multipliers mode implementation makes use of two multipliers 608 and
610 having inputs 600, 602, 604, and 606. Products 612 and 614 are
input into adder/subtracter 616, which provides output 618. In a
typical MAC block, the illustrated implementation would permit two
such sum of the products of two multipliers modes to be implemented
in a single MAC block simultaneously, each using two of the four
available multipliers. In accordance with the mode splitting features
of the present invention, the remaining two multipliers may be used
to implement any other suitable mode simultaneously with the sum of
the products of two multipliers mode that multipliers 608 and 610 are
being used to implement.

[0053] FIG. 7 is a block diagram of an illustrative
implementation of the sum of the products of four 18 bit by 18
bit multipliers mode in a MAC block. As illustrated, a single
sum of the products of four multipliers mode implementation
makes use of four multipliers 716, 718, 720, and 722 having
inputs 700, 702, 704, 706, 708, 710, 712, and 714.
Multipliers 716 and 718 may be top half multipliers and
multipliers 720 and 722 may be bottom half multipliers.
Products 724 and 726 are input into first stage
adder/subtracter 732. Products 728 and 730 are input into
first stage adder/subtracter 734. Outputs 736 and 738 from
adders/subtracters 732 and 734 are input into second stage
adder/subtracter 740, which produces output 742. Because all
of the multipliers of the MAC block are being used in the
implementation shown in FIG. 7, no other modes may be
simultaneously implemented in accordance with the present
invention.
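
The two-stage adder tree of FIG. 7 can be sketched in software
as follows. This is a minimal illustrative model rather than
the circuit itself: the function names, the signed 18 bit
operand range, and the per-adder `subtract` flags are
assumptions introduced here and do not appear in the figures.

```python
def check_18bit(x):
    """Raise if x does not fit in a signed 18 bit operand (assumed format)."""
    if not -(1 << 17) <= x < (1 << 17):
        raise ValueError(f"{x} does not fit in 18 bits")
    return x


def multiply_18x18(a, b):
    """One 18 bit by 18 bit multiplier (e.g., multiplier 716)."""
    return check_18bit(a) * check_18bit(b)


def add_sub(x, y, subtract=False):
    """One adder/subtracter (e.g., 732, 734, or 740)."""
    return x - y if subtract else x + y


def sum_of_four(inputs, subtract=(False, False, False)):
    """Combine four products with two first stage units and one second stage unit.

    inputs: eight operands, taken in pairs (one pair per multiplier).
    subtract: hypothetical add/subtract selects for the top first stage,
              bottom first stage, and second stage units, in that order.
    """
    a0, b0, a1, b1, a2, b2, a3, b3 = inputs
    top = add_sub(multiply_18x18(a0, b0), multiply_18x18(a1, b1), subtract[0])
    bottom = add_sub(multiply_18x18(a2, b2), multiply_18x18(a3, b3), subtract[1])
    return add_sub(top, bottom, subtract[2])


result = sum_of_four([1, 2, 3, 4, 5, 6, 7, 8])  # 1*2 + 3*4 + 5*6 + 7*8
```

Using only the `top` computation corresponds to the single
adder/subtracter arrangement of FIG. 6.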

[0054] FIG. 8 is a block diagram of an illustrative
implementation of a 9 bit by 9 bit multiply mode in a MAC
block. As illustrated, a single multiply mode makes use of a
single 18 bit by 18 bit multiplier 804 having inputs 800
and 802 and an output product 806. In a typical MAC block
(e.g., MAC block 192 in FIG. 3), the illustrated implementation
would permit eight such 9 bit by 9 bit multiply modes to be
implemented in a single MAC block simultaneously (i.e., each 18
bit by 18 bit multiplier may be used to implement two 9 bit
by 9 bit multipliers). In accordance with the mode splitting
features of the present invention, the remaining three 18 bit
by 18 bit multipliers and one 9 bit by 9 bit multiplier may be
used to implement any other suitable mode simultaneously with
the 9 bit by 9 bit multiply mode that multiplier 804 is being
used to implement. It will be understood that the other modes
need not involve 9 bit by 9 bit multipliers (i.e., they may
involve 18 bit by 18 bit multiplication).

[0055] FIG. 9 is a block diagram of an illustrative
implementation of the sum of the products of two 9 bit by 9 bit
multipliers mode in a MAC block. As illustrated, a single sum
of the products of two 9 bit by 9 bit multipliers mode
implementation makes use of two multipliers 908 and 910 (e.g.,
either the two top half 18 bit by 18 bit multipliers or the two
bottom half 18 bit by 18 bit multipliers) having
inputs 900, 902, 904, and 906. Products 912 and 914 are input
into adder/subtracter 916, which provides output 918. In a
typical MAC block (e.g., MAC block 192 in FIG. 3), the
illustrated implementation would permit four such sum of the
products of two 9 bit by 9 bit multipliers modes to be
implemented in a single MAC block simultaneously (i.e., because
each 18 bit by 18 bit multiplier may implement two 9 bit by 9
bit multipliers). In accordance with the mode splitting
features of the present invention, the remaining two 18 bit
by 18 bit multipliers and two 9 bit by 9 bit multipliers may be
used to implement any other suitable mode simultaneously with
the sum of the products of two 9 bit by 9 bit multipliers mode
that multipliers 908 and 910 are being used to implement.

[0056] FIG. 10 is a block diagram of an illustrative
implementation of the sum of the products of four 9 bit by 9
bit multipliers mode in a MAC block. As illustrated, a single
sum of the products of four 9 bit by 9 bit multipliers mode
implementation makes use of four 18 bit by 18 bit
multipliers 1016, 1018, 1020, and 1022 having
inputs 1000, 1002, 1004, 1006, 1008, 1010, 1012, and 1014. For
example, multipliers 1016 and 1018 may be the top half
multipliers and multipliers 1020 and 1022 may be the bottom
half multipliers. Products 1024 and 1026 are input into first
stage adder/subtracter 1032. Products 1028 and 1030 are input
into first stage adder/subtracter 1034. Outputs 1036 and 1038
from adders/subtracters 1032 and 1034 are input into second
stage adder/subtracter 1040, which produces output 1042. In a
typical MAC block (e.g., MAC block 192 in FIG. 3), the
illustrated implementation would permit two such sum of the
products of four 9 bit by 9 bit multipliers modes to be
implemented in a single MAC block simultaneously (i.e., because
each 18 bit by 18 bit multiplier may implement two 9 bit by 9
bit multipliers). In accordance with the mode splitting
features of the present invention, the remaining four 9 bit
by 9 bit multipliers may be used to implement any other
suitable mode simultaneously with the sum of the products of
four 9 bit by 9 bit multipliers mode that
multipliers 1016, 1018, 1020, and 1022 are being used to
implement. It will be understood that if there is a lack of
resources (e.g., adders), then certain modes may not be
implemented simultaneously with that of FIG. 10.

[0057] FIG. 11 is a block diagram of an illustrative
implementation of a 36 bit by 36 bit multiply mode in a MAC
block. Multiplier 1104, having inputs 1100 and 1102 and output
product 1106, is built from four 18 bit by 18 bit multipliers
and adders. Because all of the multipliers of the MAC block
are being used in the implementation shown in FIG. 11, no other
modes may be simultaneously implemented in accordance with the
present invention.
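
One common way to build a wide multiplier from four narrower
ones is to split each operand into high and low halves and sum
the four partial products after aligning them by weight. The
sketch below illustrates that decomposition for unsigned
operands. It is an assumption offered for illustration: the
text does not specify how multiplier 1104 combines its four 18
bit by 18 bit multipliers, and the names used here are
hypothetical.

```python
MASK18 = (1 << 18) - 1


def multiply_36x36(a, b):
    """Unsigned 36 bit by 36 bit multiply from four 18x18 partial products."""
    if not (0 <= a < 1 << 36 and 0 <= b < 1 << 36):
        raise ValueError("operands must fit in 36 bits")
    a_hi, a_lo = a >> 18, a & MASK18
    b_hi, b_lo = b >> 18, b & MASK18
    # Four 18 bit by 18 bit products, one per multiplier in the block.
    p_ll = a_lo * b_lo
    p_lh = a_lo * b_hi
    p_hl = a_hi * b_lo
    p_hh = a_hi * b_hi
    # The adders combine the products after shifting each to its weight.
    return (p_hh << 36) + ((p_lh + p_hl) << 18) + p_ll
```

The result is up to 72 bits wide, which is why every multiplier
and adder in the block is occupied in this mode.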

[0058] It will be understood that any other suitable modes may
be implemented in a MAC block in accordance with the present
invention. For example, certain modes may be implemented
without the need for multipliers, such as 36 bit wide XOR
gates, AND gates, OR gates, or any other suitable logical gates
using, for example, the arithmetic circuitry of the MAC block.
These modes may be useful in, for example, support of bitwise
operations for microprocessors. It will also be understood
that although some modes refer to a "sum", any other suitable
arithmetic operation may be used (e.g., difference) using the
adder/subtracter circuitry of the MAC blocks.

[0059] Other modes may include, for example, high bandwidth 16
bit and 32 bit cyclic redundancy code ("CRC") calculations.
CRC is used in many communications protocols to ensure the
received data is the same as the transmitted data. CRC
encoding/decoding is relatively simple to implement for
coding 1 bit at a time, but increases in complexity for coding
multiple bits simultaneously.
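
The difference in complexity can be seen by comparing a 1 bit
at a time CRC with one that consumes 8 bits per step. The
sketch below uses the widely published CRC-16/CCITT-FALSE
parameters (polynomial 0x1021, initial value 0xFFFF) purely as
an example. The text does not name a polynomial, so this choice
and all function names are assumptions.

```python
POLY = 0x1021   # assumed polynomial (CRC-16/CCITT-FALSE)
INIT = 0xFFFF   # assumed initial register value


def crc16_bitwise(data):
    """Shift in one message bit per step: a shift and a conditional XOR."""
    crc = INIT
    for byte in data:
        for i in range(7, -1, -1):
            bit = (byte >> i) & 1
            top = (crc >> 15) & 1
            crc = (crc << 1) & 0xFFFF
            if top ^ bit:
                crc ^= POLY
    return crc


def make_table():
    """Precompute the effect of eight consecutive 1 bit steps."""
    table = []
    for n in range(256):
        crc = n << 8
        for _ in range(8):
            crc = ((crc << 1) ^ POLY) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
        table.append(crc)
    return table


def crc16_bytewise(data, table):
    """Consume 8 bits per step using the 256 entry table."""
    crc = INIT
    for byte in data:
        crc = ((crc << 8) & 0xFFFF) ^ table[(crc >> 8) ^ byte]
    return crc


TABLE = make_table()
```

The 1 bit version needs only a shift and one conditional XOR
per step, whereas the 8 bit version needs a 256 entry lookup
(or, in hardware, an equivalent network of XOR terms), which
grows with the number of bits coded at once.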

[0060] Because 12 bit by 12 bit multiplication requires a
full 18 bit by 18 bit multiplier to implement, waste of
resources results (e.g., only 96 bits of the inputs/outputs are
used). In accordance with the present invention, the four 18
bit by 18 bit multipliers of each MAC block may support six 12
bit by 12 bit multiplications, instead of only four, by
allowing unused resources to be used in separate simultaneous
modes. This is contrasted with implementing 9 bit by 9 bit
multipliers from 18 bit by 18 bit multipliers because the
splitting of an 18 bit by 18 bit multiplier into two 9 bit
by 9 bit multipliers is contained within the 18 bit by 18 bit
multiplier. Splitting two 18 bit by 18 bit multipliers into
three 12 bit by 12 bit multipliers involves sharing resources
between the multipliers. This requires complex routing and
input mapping.
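
The figures above are consistent with a simple count of operand
bits, sketched below. This accounting is one possible reading
of the 96 bit figure and is an assumption made here for
illustration; the helper name is hypothetical.

```python
def operand_bits(width, count):
    """Total input bits consumed by `count` multipliers of width x width."""
    return 2 * width * count


available = operand_bits(18, 4)      # four 18x18 multipliers: 144 bits
used_by_four = operand_bits(12, 4)   # four 12x12 multiplies:   96 bits
used_by_six = operand_bits(12, 6)    # six 12x12 multiplies:   144 bits
```

Under this reading, four 12 bit by 12 bit multiplications use
only two thirds of the available input bits, and six such
multiplications use all of them. Because a product is as wide
as its two operands combined, the same totals apply to the
outputs.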

[0061] For similar reasons, a 24 bit by 24 bit multiplication,
which would ordinarily require a full MAC block to implement as
a 36 bit by 36 bit multiplication, may be made more efficient
in accordance with the present invention to allow a single MAC
block to support two simultaneous 24 bit by 24 bit
multiplications.

[0062] The mode splitting features of the present invention
may be implemented in any suitable way. For example, in one
suitable approach, a MAC block may be configured using
appropriate circuitry (e.g., including multiplexers, registers,
etc.) to allow different modes to be implemented simultaneously
within the same MAC block. Any suitable control signals may be
used in order to indicate how a MAC block is to be configured
with regard to the modes to be simultaneously implemented. Any
or all of these control signals may be user-controlled.

[0063] FIG. 12 is a simplified block diagram of a MAC
block 1200 according to the present invention having control
signals 1201-1211. Control signals 1201-1211 may indicate in
which mode or modes MAC block 1200 simultaneously operates.
Control signals 1201-1211 are merely illustrative. It will be
understood that any other suitable control signals may be used
to implement the mode splitting features of the present
invention. For purposes of brevity and clarity, not by way of
limitation, and without loss of generality, the present
invention is primarily described herein in terms of control
signals 1201-1211.

[0064] Control signals 1201-1204 are "SPLIT" signals that
indicate for each of the four respective 18 bit by 18 bit
multipliers of MAC block 1200 whether the multiplier is to be
used as an 18 bit by 18 bit multiplier or whether the
multiplier is to be used as two 9 bit by 9 bit multipliers.
Any suitable number of SPLIT signals may be used to implement
any suitable mode requiring the use of a particular sized
multiplier. For example, if a particular mode requires the use
of 4 bit by 4 bit multipliers, then additional SPLIT signals
may be used. In another suitable approach, SPLIT signals may
be used to indicate that a particular multiplier is to be used
as two or more smaller multipliers (i.e., as opposed to being
split into only two smaller multipliers).
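
The effect of a SPLIT signal on one multiplier can be modeled
as follows. This is a behavioral sketch only: the unsigned
arithmetic, the packing of two 9 bit operands into each 18 bit
input, and the function name are assumptions, since the text
does not describe how operands are presented to a split
multiplier.

```python
MASK9 = (1 << 9) - 1
MASK18 = (1 << 18) - 1


def multiplier(x, y, split):
    """Model one 18 bit by 18 bit multiplier with a SPLIT control.

    Inputs are truncated to 18 bits.
    split=False: returns a one element list holding the 36 bit product.
    split=True:  each input is treated as two packed 9 bit operands
                 (assumed packing); returns [low product, high product],
                 each up to 18 bits wide.
    """
    x &= MASK18
    y &= MASK18
    if not split:
        return [x * y]
    return [(x & MASK9) * (y & MASK9), (x >> 9) * (y >> 9)]


# Two independent 9x9 products from one multiplier: 4*6 and 3*5.
packed_x = (3 << 9) | 4
packed_y = (5 << 9) | 6
products = multiplier(packed_x, packed_y, split=True)
```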

[0065] Control signals 1205 and 1208 represent "SMODE" signals
that may be used to indicate whether the accumulator
functionality of MAC block 1200 is to be enabled. Thus,
control signals 1205 and 1208 may be used to implement a 52 bit
accumulate mode. Control signal 1205 may be associated with
the top half of MAC block 1200 whereas control signal 1208 may
be associated with the bottom half of MAC block 1200.

[0066] Control signals 1206 and 1209 represent "ZERO" signals
that may be used to indicate, together with the SMODE signal,
whether mode 2A is to be implemented. Mode 2A is used to
initialize (e.g., by zeroing) the accumulator components used
in mode 2 (i.e., 52 bit accumulator mode described with
reference to FIG. 5 above). With reference to FIG. 2, control
signals 1206 and 1209 may, for example, cause appropriate bits
of either top half adder/subtracter 144 or bottom half
adder/subtracter 144, respectively, to be tied to ground in
order to zero the accumulator. Control signals 1206 and 1209
may also be used to indicate, in one particular arrangement,
a 36 bit by 36 bit multiply. That is, in a preferable
implementation of a 36 bit by 36 bit multiply mode, zeroing of
the adder/subtracters may be necessary.
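
The interaction of the SMODE and ZERO signals for one half of
the MAC block might be modeled as in the sketch below. The
class name, the active-high polarity of both signals, the
unsigned wraparound at 52 bits, and the choice to load the
incoming product while the feedback path is zeroed are all
assumptions made for illustration.

```python
ACC_BITS = 52
ACC_MASK = (1 << ACC_BITS) - 1


class HalfBlockAccumulator:
    """Toy model of one half of a MAC block in accumulate mode."""

    def __init__(self):
        self.value = 0

    def step(self, product, smode, zero):
        """Advance one cycle and return the stored value.

        smode: accumulate mode enabled for this half.
        zero:  feedback path forced to zero (mode 2A), so the register
               is reloaded with the incoming product alone.
        """
        if not smode:
            return self.value  # accumulator not in use this cycle
        feedback = 0 if zero else self.value
        self.value = (feedback + product) & ACC_MASK
        return self.value


acc = HalfBlockAccumulator()
first = acc.step(10, smode=True, zero=True)    # initialize with first term
second = acc.step(20, smode=True, zero=False)  # accumulate
```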

[0067] Control signals 1207 and 1210 represent "MODE 3" signals
that may be used to indicate when the outputs of two
multipliers (i.e., either two in the top half or two in the
bottom half of the MAC block) are to be added together.
Control signals 1207 and 1210 are therefore used to indicate
when the sum of two 18 bit by 18 bit multiplications mode is to
be implemented or when the sum of two 9 bit by 9 bit
multiplications mode is to be implemented. Control
signals 1207 and 1210 are associated with the top half and the
bottom half of the MAC block, respectively.

[0068] Control signal 1211 represents a "MODE 4" signal that
may be used to indicate when the outputs of four multipliers
are to be added together. Control signal 1211 is therefore
used to indicate when the sum of four 18 bit by 18 bit
multiplications mode is to be implemented or when the sum of
four 9 bit by 9 bit multiplications mode is to be implemented.
Because all four multipliers of the MAC block are used in these
modes, a single MODE 4 signal is used for the entire MAC block.

[0069] Table 1, below, summarizes the above control signals as
used to implement each of the respective modes described. A,
B, C, and D represent each of the four 18 bit by 18 bit
multipliers in the MAC block, with A and B being the top half
multipliers and C and D being the bottom half multipliers. R
and S represent the top half and bottom half of the MAC block.

[0070] TABLE 1

[0071] Each of modes 1-8 in TABLE 1 may be implemented in a
single 18 bit by 18 bit multiplier, in half of the MAC block
(i.e., either the top half or the bottom half), or in the
entire MAC block. TABLE 2 summarizes this flexibility below.



[0072] TABLE 2

MODE  Description                    Per         Per half   Per MAC
                                     multiplier  MAC block  block
1     18bx18b multiply               X
2     52 bit accumulate                          X
2A    Initialize/zero accumulator                X
3     Sum of 2 18bx18b multiply                  X
4     Sum of 4 18bx18b multiply                             X
5     9bx9b multiply                 X
6     Sum of 2 9bx9b multiply                    X
7     Sum of 4 9bx9b multiply                               X
8     36bx36b multiply                                      X



[0073] If a particular mode requires half of the MAC block,
then the other half may be used by a half block mode or by a
single multiplier mode (or two such modes). If a particular
mode requires a single multiplier, then the remaining
multipliers may be used by a half block mode, by single
multiplier modes, or by both.
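
The combinations permitted above can be checked mechanically.
The sketch below treats each mode as occupying the full
footprint listed for it in TABLE 2 (one multiplier, one half,
or the whole block). That is a simplification assumed here for
illustration: it ignores, for example, that a 9 bit by 9 bit
mode occupies only half of an 18 bit by 18 bit multiplier, and
it does not model contention for adders. The names are
hypothetical.

```python
# Multipliers occupied per mode, following TABLE 2:
# 1 = single multiplier, 2 = half MAC block, 4 = entire MAC block.
FOOTPRINT = {
    "1": 1, "2": 2, "2A": 2, "3": 2, "4": 4,
    "5": 1, "6": 2, "7": 4, "8": 4,
}


def can_coexist(modes):
    """Return True if the listed modes fit in one MAC block simultaneously."""
    sizes = [FOOTPRINT[m] for m in modes]
    blocks = sizes.count(4)
    halves = sizes.count(2)
    singles = sizes.count(1)
    if blocks:
        return len(sizes) == 1          # a full block mode leaves nothing over
    if halves > 2:
        return False
    return singles <= 2 * (2 - halves)  # two multipliers per unused half
```

For example, `can_coexist(["3", "1", "1"])` places a sum of two
products in one half and two independent multiplies in the
other, while `can_coexist(["4", "1"])` is rejected because
mode 4 uses every multiplier.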

[0074] It will be understood that certain arrangements of a
MAC block in accordance with the present invention may involve
certain consequential and practical restrictions. For example,
in one suitable arrangement, modes 4, 7, and 8 require control
signals 1201-1204 to be set to the same value. Control
signals 1205 and 1208 may be required to be set to the same
value for modes 4, 7, and 8. Control signals 1206 and 1209 may
be required to be set to the same value for modes 4, 7, and 8.
Modes 3 and 6 may require control signals 1201 and 1202 to be
set to the same value and control signals 1203 and 1204 to be
set to the same value. It will be understood that such
restrictions are merely illustrative and depend at least in
part on the particular arrangement used, the application for
which the MAC block will be used, or both.
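
The illustrative restrictions above can be expressed as a small
consistency check. The sketch below follows the text literally,
including applying both SPLIT pair checks whenever mode 3 or
mode 6 is active. The dictionary representation of the control
signals, the mode names as strings, and the function name are
assumptions.

```python
def check_restrictions(signals, modes):
    """Return a list of violated restrictions (empty if none).

    signals: dict mapping control signal number (1201-1211) to 0 or 1.
    modes:   set of active mode names, e.g. {"4"} or {"3", "6"}.
    """
    problems = []

    def same(*numbers):
        return len({signals[n] for n in numbers}) == 1

    if modes & {"4", "7", "8"}:
        if not same(1201, 1202, 1203, 1204):
            problems.append("SPLIT signals 1201-1204 differ")
        if not same(1205, 1208):
            problems.append("SMODE signals 1205 and 1208 differ")
        if not same(1206, 1209):
            problems.append("ZERO signals 1206 and 1209 differ")
    if modes & {"3", "6"}:
        if not same(1201, 1202):
            problems.append("SPLIT signals 1201 and 1202 differ")
        if not same(1203, 1204):
            problems.append("SPLIT signals 1203 and 1204 differ")
    return problems
```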

[0075] It will be understood that any other suitable modes may
be represented and implemented according to the present
invention. It will further be understood that any other
control signals may be used in addition to or in place of those
illustrated.

[0076] FIG. 13 is a simplified block diagram of a programmable
logic device 1300 having one or more MAC blocks 1302 configured
in accordance with the present invention. PLD 1300 may have
any suitable interconnection circuitry, memory circuitry, and
programmable logic circuitry to allow PLD 1300 to implement
user designs and to make use of MAC blocks 1302 in implementing
the user designs.

[0077] FIG. 14 illustrates a PLD 1300 (FIG. 13) of this
invention (i.e., having at least one multiplier configured with
the mode splitting features of the present invention) in a data
processing system 1400 in accordance with one embodiment of the
present invention. Data processing system 1400 may include one
or more of the following components: a processor 1402;
memory 1404; I/O circuitry 1406; and peripheral devices 1408.
These components are coupled together by a system bus 1410 and
are populated on a circuit board 1412 which is contained in an
end-user system 1414.

[0078] System 1400 may be used in a wide variety of
applications, such as computer networking, data networking,
instrumentation, video processing, DSP, or any other
application where the advantage of using programmable or
reprogrammable logic is desirable. PLD 1300 may be used to
perform a variety of different logic functions. For example,
PLD 1300 may be configured as a processor or controller that
works in cooperation with processor 1402. PLD 1300 may also be
used as an arbiter for arbitrating access to a shared resource
in system 1400. In yet another example, PLD 1300 may be
configured as an interface between processor 1402 and one of
the other components in system 1400.

[0079] Thus, a MAC block having mode splitting capabilities is
provided. One skilled in the art will appreciate that the
present invention can be practiced by other than the described
embodiments, which are presented for purposes of illustration
and not of limitation, and the present invention is limited
only by the claims which follow.



